Introduction

In this RMarkdown file, I will perform the data analysis for my thesis. I will go over the hypotheses in more detail in each section, but first briefly describe the experiment.

During the first session of the experiment, participants were presented with numerical facts such as “X out of 10 bus drivers are women”, and had to make a prediction (e.g. 2 out of 10 bus drivers are women). They were then given an answer (given_answer) that was either correct or manipulated to be obviously false or implausible (e.g. 8 out of 10 bus drivers are women). Participants then indicated how surprised they were on a scale of 1-5, and had the option to indicate that they thought the fact had been manipulated to be implausible. After this training phase, participants were again presented with the same facts, and had to indicate which answer had been presented to them earlier.

The second part of the experiment took place 2 days after the first one. Participants were again presented with numerical facts. Half of these facts were repeated from day 1, and the other half were new. Participants had to indicate what they thought the correct answer was, as well as how sure they were they had seen the fact before on a scale of 1-5 (definitely not seen before to definitely seen before).

This setup allowed us to investigate three kinds of learning:

  1. Immediate recall: how well do participants remember the given_answer they were just presented with?

  2. Delayed recognition: how well can participants indicate whether they have seen a given fact before?

  3. Belief updating: do participants update their beliefs about the numerical facts presented to them, based on the given_answer?

Additionally, we are specifically interested in the influence of surprise and plausibility on both immediate and delayed recall. We will perform two multilevel analyses, one for each kind of learning, and investigate whether surprise and plausibility lead to a better fit of the multilevel model. For every model, we expect facts that are surprising but still plausible to be remembered better than facts that are implausible or facts that are not surprising.
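As a sketch of what such a model-fit comparison could look like, the snippet below simulates toy data (all names and values are illustrative assumptions, not taken from the experiment) and compares a null multilevel model against one including surprise via a likelihood-ratio test:

```r
library(lme4)

set.seed(1)
#Toy data: 40 participants x 30 items, with a weak simulated effect of surprise
d <- expand.grid(participant = factor(1:40), item = factor(1:30))
d$surprise <- rnorm(nrow(d))
d$correct  <- rbinom(nrow(d), 1, plogis(-0.5 + 0.3 * d$surprise))

#Null model vs. model including surprise, both with crossed random intercepts
m0 <- glmer(correct ~ 1 + (1 | participant) + (1 | item),
            data = d, family = binomial)
m1 <- glmer(correct ~ surprise + (1 | participant) + (1 | item),
            data = d, family = binomial)

#Likelihood-ratio test: does adding surprise improve model fit?
anova(m0, m1)
```

A significantly lower deviance for `m1` would indicate that surprise improves the fit; the same logic applies to plausibility.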

Setup

To prepare this file for conducting analyses, we set up the Rmarkdown settings, load the required libraries, and load in the data.

knitr::opts_chunk$set(echo = TRUE)
library(lme4) #for model
library(ggplot2) #for plots
library(ggpubr) #for arranging plots into a grid
library(car) #for performing the anova
library(psych) #for calculating correlations
library(dplyr) #for using %>% operator
library(tidyverse) #for using summarise function
library(emmeans) #for post-hoc analyses
library(sjPlot) #for table outputs of model results and assumption checks
library(effects) #for creating pretty plots
library(simr) #for running post-hoc power simulations
library(lattice) #for testing assumptions
setwd("~/Library/CloudStorage/OneDrive-Personal/Sync/Studies/Research Master/Thesis/Data/4-Final version")

data <- read.table("full_data.csv", header=T, sep=";")

Item check

First, we look at different variables to make sure values are as expected. We do this for: given_answer, original_estimate, surprise, recognition_answer, estimate_day2, and familiarity_rating.

#First: make sure the item variable is numeric
data_item <- data
data_item$item <- as.numeric(data_item$item)

#Items 61 and up are the new facts introduced on day 2, so for this item check we keep only the 60 day-1 items
data_60 <- subset(data_item, item < 61)                          

#given answer
ggplot(data_60,aes(x=item,y = given_answer)) +
  geom_bar(stat = "summary", fun= "mean")

#original_estimate
ggplot(data_60,aes(x=item,y = original_estimate)) +
  geom_bar(stat = "summary", fun= "mean")

#surprise
ggplot(data_60,aes(x=item,y = surprise)) +
  geom_bar(stat = "summary", fun= "mean")

#recognition_answer
ggplot(data_60,aes(x=item,y = recognition_answer)) +
  geom_bar(stat = "summary", fun= "mean")

#estimate_day2
ggplot(data_60,aes(x=item,y = estimate_day2)) +
  geom_bar(stat = "summary", fun= "mean")
## Warning: Removed 675 rows containing non-finite values (stat_summary).

#familiarity_rating
ggplot(data_60,aes(x=item,y = familiarity_rating)) +
  geom_bar(stat = "summary", fun= "mean")
## Warning: Removed 675 rows containing non-finite values (stat_summary).


The familiarity rating is very high for all items, meaning that participants are very good at remembering what facts they have seen before. We expected this to vary more between items, and didn’t anticipate that the average would be this high. All other distributions of items appear to be as we expected, with nothing seeming out of the ordinary. We then perform a quick check of reaction times.

#First: subsetting data to only include reaction time variables
data_rt <- data[,c(1, 5, 8, 10, 12, 14)]

#Summary of the reaction time data
summary(data_rt)
##   participant    original_estimate_rt  surprise_rt        recognition_rt     
##  Min.   :26295   Min.   :7.000e+00    Min.   :2.600e+01   Min.   :5.400e+01  
##  1st Qu.:31669   1st Qu.:2.602e+04    1st Qu.:1.564e+04   1st Qu.:1.810e+04  
##  Median :31768   Median :4.408e+04    Median :2.572e+04   Median :3.133e+04  
##  Mean   :31744   Mean   :2.050e+07    Mean   :1.167e+07   Mean   :1.966e+07  
##  3rd Qu.:32152   3rd Qu.:6.606e+04    3rd Qu.:3.853e+04   3rd Qu.:4.621e+04  
##  Max.   :32304   Max.   :6.850e+09    Max.   :6.018e+09   Max.   :6.681e+09  
##                  NA's   :2025         NA's   :2025        NA's   :2025       
##  estimate_day2_rt familiaityr_rt  
##  Min.   :     6   Min.   :    11  
##  1st Qu.: 20955   1st Qu.:  9265  
##  Median : 35774   Median : 15362  
##  Mean   : 45372   Mean   : 16827  
##  3rd Qu.: 55443   3rd Qu.: 21562  
##  Max.   :821424   Max.   :650374  
##  NA's   :675      NA's   :675


The means and maxima are extremely high relative to the medians, suggesting a small number of extreme values. We therefore compute total and mean reaction times per participant to figure out where these come from.

#Total reaction time per row (trial), summing the reaction-time columns
data_rt$mean <- rowSums(data[,2:6])

data_rt_mean <- data_rt %>%
                   group_by(participant) %>%
                   summarise(sum_rt = sum(mean, na.rm = TRUE),
                   mean_rt = sum_rt/60)

print(data_rt_mean, n = 45)
## # A tibble: 45 x 3
##    participant      sum_rt    mean_rt
##          <int>       <dbl>      <dbl>
##  1       26295     2914733     48579.
##  2       30234     2039983     34000.
##  3       31515     3864011     64400.
##  4       31534     4908457     81808.
##  5       31538     2832596     47210.
##  6       31564     2643483     44058.
##  7       31603     3235329     53922.
##  8       31615      278879      4648.
##  9       31623     3764620     62744.
## 10       31639     2683230     44720.
## 11       31663     4004651     66744.
## 12       31669     2618245     43637.
## 13       31676     3333360     55556 
## 14       31683     2716099     45268.
## 15       31687     2541064     42351.
## 16       31692     3514231     58571.
## 17       31710     3227291     53788.
## 18       31712     4256665     70944.
## 19       31718     3888464     64808.
## 20       31721     3387191     56453.
## 21       31722     2575887     42931.
## 22       31757      228525      3809.
## 23       31768     4667866     77798.
## 24       31770      244339      4072.
## 25       31771 55206116471 920101941.
## 26       31775      600577     10010.
## 27       32054     2788993     46483.
## 28       32084           0         0 
## 29       32111     2826740     47112.
## 30       32114     3973952     66233.
## 31       32129     3793761     63229.
## 32       32137      387189      6453.
## 33       32145     4508384     75140.
## 34       32152      409212      6820.
## 35       32191     2308880     38481.
## 36       32194     5132443     85541.
## 37       32256     3608947     60149.
## 38       32259     4955953     82599.
## 39       32271     2695602     44927.
## 40       32278     4700447     78341.
## 41       32279     2728308     45472.
## 42       32286     3544283     59071.
## 43       32292     3448673     57478.
## 44       32298     2755662     45928.
## 45       32304     3696199     61603.


From this, we can see that participant 31771 has reaction times far higher than expected, with a mean reaction time of 920101941 ms, which is over 10 days. We suspect this is due to a software issue, and therefore remove this participant from further analysis. Furthermore, participant 32084 has a sum of 0, which is also strange; we will come back to this participant later.
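As a quick sanity check of the magnitude claim, the millisecond-to-day conversion can be sketched as follows (the 920101941 ms value is copied from the table above; the toy vector and the 5-minute flagging threshold are illustrative assumptions, not the study's criteria):

```r
#Convert a mean reaction time in milliseconds to days
ms_to_days <- function(ms) ms / (1000 * 60 * 60 * 24)

round(ms_to_days(920101941), 2)  #about 10.65 days per trial

#Illustrative flagging rule: a mean RT above 5 minutes, or exactly 0, is suspicious
toy_means <- c("31771" = 920101941, "31768" = 77798, "32084" = 0)
names(toy_means)[toy_means > 5 * 60 * 1000 | toy_means == 0]
```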

We then look at what items are more often indicated as implausible by participants.

#Table of plausibility per item
table(data_60$item, data_60$implausibility)
##     
##       0  1
##   1  41  3
##   2  41  3
##   3  40  4
##   4  35  9
##   5  42  2
##   6  39  5
##   7  40  4
##   8  35  9
##   9  32 12
##   10 41  3
##   11 43  1
##   12 41  3
##   13 38  6
##   14 34 10
##   15 32 12
##   16 32 12
##   17 21 23
##   18 43  1
##   19 41  3
##   20 25 19
##   21 43  1
##   22 43  1
##   23 36  8
##   24 32 12
##   25 31 13
##   26 13 31
##   27 37  7
##   28 31 13
##   29 33 11
##   30 35  9
##   31 38  6
##   32 23 21
##   33 41  3
##   34 23 21
##   35 36  8
##   36 39  5
##   37 37  7
##   38 40  4
##   39 40  4
##   40 23 21
##   41 43  1
##   42 35  9
##   43 32 12
##   44 33 11
##   45 23 21
##   46 37  7
##   47 36  8
##   48  9 35
##   49 39  5
##   50 37  7
##   51 39  5
##   52 36  8
##   53 30 14
##   54 40  4
##   55 27 17
##   56 36  8
##   57 25 19
##   58 31 13
##   59 41  3
##   60 22 22


A few items were very often indicated to be implausible by participants, such as item 48 (1 out of 10 Wikipedia authors is a man). However, the counts per item (the 0 and 1 columns summed) only add up to 44, while we have 45 participants. Therefore, we look into participants’ individual plausibility data.

#Table of plausibility per participant
table(data_60$participant, data_60$implausibility)
##        
##          0  1
##   26295 48 12
##   30234 49 11
##   31515 46 14
##   31534 43 17
##   31538 46 14
##   31564 49 11
##   31603 49 11
##   31615 42 18
##   31623 28 32
##   31639 47 13
##   31663 56  4
##   31669 48 12
##   31676 46 14
##   31683 42 18
##   31687 38 22
##   31692 59  1
##   31710 49 11
##   31712 48 12
##   31718 41 19
##   31721 41 19
##   31722 45 15
##   31757 44 16
##   31768 46 14
##   31770 45 15
##   31771 54  6
##   31775 43 17
##   32054 34 26
##   32084  0  0
##   32111 47 13
##   32114 38 22
##   32129 45 15
##   32137 59  1
##   32145 53  7
##   32152 35 25
##   32191 57  3
##   32194 46 14
##   32256 55  5
##   32259 52  8
##   32271 48 12
##   32278 50 10
##   32279 46 14
##   32286 50 10
##   32292 49 11
##   32298 51  9
##   32304 54  6


It looks like one participant (32084) has no implausibility data at all. This happens when a participant never indicates that a fact is implausible, because the variable is then never created. We therefore remove this participant from further analysis.

#We now remove the two aforementioned participants
data <- data[data$participant != "31771" & data$participant != "32084",]


Finally, we create frequency plots for both surprise and plausibility, to investigate how many times participants indicated that a fact was surprising or implausible.

plot_surprise <- ggplot(data = subset(data, !is.na(surprise)), aes(x = surprise)) +
  geom_bar(fill = 'white', color = 'black') 


plot_implausible <- ggplot(data = subset(data, !is.na(implausibility)), aes(x = implausibility)) +
  geom_bar(fill = 'white', color = 'black') 

ggarrange(plot_surprise, plot_implausible,
          labels = c("Surprise", "Implausible"),
          ncol = 2, nrow = 1)


## Surprise measure

While previous research often uses violation of expectancy (the error score) to calculate a surprise rating, we propose a different, more direct method of assessing surprise: we asked participants to indicate directly how surprised they were, on a scale of 1-5, resulting in a more metacognitive measure.

Firstly, we will calculate the more traditional violation of expectancy as a measure of surprise, and then correlate it to our direct measure of surprise. We expect these two measures to correlate moderately to highly. Facts are defined as violating the expectancy if the given_answer (the answer we give them) and original_estimate (the estimate participants made) differ by 2 or more.

We use the phi coefficient to measure the correlation between the two binary measures; phi is equivalent to the Pearson correlation computed over two binary variables.
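As a minimal illustration with toy binary vectors (standing in for errorscore_violation and binary_surprise, not study data), phi can be obtained either as Pearson's r on the 0/1 codes or directly from the 2x2 table:

```r
#Toy binary vectors (illustrative only)
violation    <- c(1, 1, 0, 0, 1, 0, 1, 1)
surprise_bin <- c(1, 0, 0, 0, 1, 0, 1, 1)

#Route 1: Pearson's r computed on binary variables is the phi coefficient
phi_pearson <- cor(violation, surprise_bin)

#Route 2: the classical phi formula from the 2x2 contingency table
tab <- table(violation, surprise_bin)
n11 <- tab["1", "1"]; n10 <- tab["1", "0"]
n01 <- tab["0", "1"]; n00 <- tab["0", "0"]
phi_table <- (n11 * n00 - n10 * n01) /
  sqrt((n11 + n10) * (n01 + n00) * (n11 + n01) * (n10 + n00))

all.equal(phi_pearson, phi_table)  #both routes give the same value
```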

##Create violation of expectancy measure as a dummy variable
#First: create a variable with the absolute difference of given_answer and original_estimate to compute an error score
data$errorscore_baseline <- abs(data$original_estimate - data$given_answer)

#Then: create binary variable where 0 = no violation of expectancy and 1 = violation of expectancy
data$errorscore_violation <- ifelse(data$errorscore_baseline >= 2, 1, 0)

#Quick plot to see the amount of facts that violated the expectancy (= high surprise)
plot_violation <- ggplot(data = subset(data, !is.na(errorscore_violation)), aes(x = errorscore_violation)) +
  geom_bar(fill = 'white', color = 'black') 

plot_violation

#From plot: about 2/3 of facts violated the expectancy

##Now create a measure for our surprise rating
#Create binary variable for low/high surprise, where participants indicated how surprised they were
#1-3 = low surprise (coded as 0), 4-5 = high surprise (coded as 1)
data$binary_surprise  <- ifelse(data$surprise <= 3, 0, 1)

#Check the correlation between the two surprise measures
#Note: cor.test() accepts a single method only (a vector of methods is silently reduced to its first element), so we specify Pearson explicitly and use it for the analysis
cor.test(data$errorscore_baseline, data$surprise, method = "pearson")
## 
##  Pearson's product-moment correlation
## 
## data:  data$errorscore_baseline and data$surprise
## t = 63.964, df = 2578, p-value < 2.2e-16
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
##  0.7678564 0.7977167
## sample estimates:
##       cor 
## 0.7832378
#0.783, high correlation

#Check the correlation between the surprise rating and implausibility (again Pearson, since cor.test() accepts a single method only)
cor.test(data$implausibility, data$surprise, method = "pearson")
## 
##  Pearson's product-moment correlation
## 
## data:  data$implausibility and data$surprise
## t = 35.636, df = 2578, p-value < 2.2e-16
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
##  0.5480445 0.5997780
## sample estimates:
##       cor 
## 0.5744847
#0.574, moderate to high correlation

Based on the correlation of 0.783, we conclude that there is a strong positive relationship between the traditional violation-of-expectancy surprise measure and the metacognitive surprise measure we introduced. We will therefore continue with our surprise measure in further analyses.

Creating and centering variables

To create a within-person measure of surprise, we will center surprise around a within-person mean. We will do the same for error scores for the baseline and the recognition condition.

#First: Create binary error scores, where any deviation from the original given_answer = 0 (incorrect) and no deviation = 1 (correct)
data$errorscore_baseline_bin <- ifelse(data$errorscore_baseline == 0, 1, 0)

data$errorscore_recognition <- abs(data$recognition_answer - data$given_answer)
data$errorscore_recognition_bin <- ifelse(data$errorscore_recognition == 0, 1, 0)

#Then: split up the dataset so we only take the facts from day 1
data_day1 <- subset(data, !is.na(surprise))

#Compute mean surprise scores per participant, and combine it with the existing dataframe
data_centered <- data_day1 %>%
  group_by(participant) %>%
  summarize(surprise_mean = mean(surprise), errorscore_baseline_mean = mean(errorscore_baseline)) %>%
  left_join(data_day1, by = c("participant"))

#Compute centered scores
data_centered$surprise_centered <- data_centered$surprise - data_centered$surprise_mean
data_centered$errorscore_baseline_centered <- data_centered$errorscore_baseline - data_centered$errorscore_baseline_mean

#Getting some descriptives
summary(data_centered[c("surprise_centered", "errorscore_baseline_centered")])
##  surprise_centered  errorscore_baseline_centered
##  Min.   :-2.56667   Min.   :-3.0500             
##  1st Qu.:-1.55000   1st Qu.:-1.4667             
##  Median : 0.04167   Median :-0.4167             
##  Mean   : 0.00000   Mean   : 0.0000             
##  3rd Qu.: 1.28333   3rd Qu.: 1.2500             
##  Max.   : 2.68333   Max.   : 6.4333
summary(data[c("surprise", "errorscore_baseline")])
##     surprise     errorscore_baseline
##  Min.   :1.000   Min.   :0.000      
##  1st Qu.:1.000   1st Qu.:1.000      
##  Median :3.000   Median :2.000      
##  Mean   :2.871   Mean   :2.604      
##  3rd Qu.:4.000   3rd Qu.:4.000      
##  Max.   :5.000   Max.   :9.000      
##  NA's   :1935    NA's   :1935

Descriptive statistics

To inspect correlations between variables, we compute a Pearson correlation matrix for the independent variables of interest (surprise, participant, plausibility, and item) and the outcome measures (recall, recognition, and updating).

data_descriptives <- data[, c("surprise", "participant", "item", "implausibility", "recognition_answer",
                              "familiarity_rating", "estimate_day2")]

cor(data_descriptives, use = "pairwise.complete.obs")
##                       surprise   participant          item implausibility
## surprise            1.00000000  2.481822e-02  1.667782e-01    0.574484679
## participant         0.02481822  1.000000e+00 -7.689365e-21   -0.009841641
## item                0.16677822 -7.689365e-21  1.000000e+00    0.116656085
## implausibility      0.57448468 -9.841641e-03  1.166561e-01    1.000000000
## recognition_answer -0.02628355  1.619594e-02 -2.076335e-02   -0.003508043
## familiarity_rating  0.07349750 -9.086395e-03 -8.349992e-01    0.090867942
## estimate_day2       0.01349821  8.334982e-03 -6.088873e-02    0.036582389
##                    recognition_answer familiarity_rating estimate_day2
## surprise                 -0.026283550        0.073497498   0.013498210
## participant               0.016195939       -0.009086395   0.008334982
## item                     -0.020763348       -0.834999234  -0.060888726
## implausibility           -0.003508043        0.090867942   0.036582389
## recognition_answer        1.000000000        0.023725937   0.668395509
## familiarity_rating        0.023725937        1.000000000   0.048861425
## estimate_day2             0.668395509        0.048861425   1.000000000

Hypothesis 1

Our first hypothesis concerns the immediate recall: how well do participants remember the numerical fact they were just presented with? For example, if the participant estimated that 2 out of 10 bus drivers are women, but we tell them that 8 out of 10 bus drivers are women, the participant here has to indicate 8 (the given_answer).

The research question, then, is “do surprise and plausibility influence performance on an immediate recall test?”. We expect that facts that are surprising but still plausible are remembered better than facts that are implausible or facts that are not surprising. Thus, we expect the difference between the answer given by us (given_answer) and the participant’s estimate of that given_answer (recognition_answer) to be smaller in surprising but plausible facts.

Since we cannot include surprise and plausibility in the same model due to multicollinearity, we will first investigate the role of surprise on immediate recall. There are two ways to code immediate recall in the model: (a) correct/incorrect, where an answer only counts as correct if the participant fills in the exact number we presented them with, or (b) the absolute error, where we take the absolute difference between the answer the participant filled in and the number we presented them with. To create the most complete picture, we will look into both. We will then create a model to investigate the role of plausibility on immediate recall, restricted to facts that were rated as highly surprising.

#Linear + quadratic effect of surprise on immediate recall as binary variable
model1 <- glmer(errorscore_recognition_bin ~ 1 + surprise_centered + I(surprise_centered^2) + (1 | participant) + (1 | item), data = data_centered, family = binomial)

#Summary of the model
summary(model1)
## Generalized linear mixed model fit by maximum likelihood (Laplace
##   Approximation) [glmerMod]
##  Family: binomial  ( logit )
## Formula: 
## errorscore_recognition_bin ~ 1 + surprise_centered + I(surprise_centered^2) +  
##     (1 | participant) + (1 | item)
##    Data: data_centered
## 
##      AIC      BIC   logLik deviance df.resid 
##   3351.2   3380.4  -1670.6   3341.2     2575 
## 
## Scaled residuals: 
##     Min      1Q  Median      3Q     Max 
## -1.2884 -0.7540 -0.6233  1.1293  2.5756 
## 
## Random effects:
##  Groups      Name        Variance Std.Dev.
##  item        (Intercept) 0.1404   0.3747  
##  participant (Intercept) 0.1044   0.3232  
## Number of obs: 2580, groups:  item, 60; participant, 43
## 
## Fixed effects:
##                        Estimate Std. Error z value Pr(>|z|)    
## (Intercept)            -0.69916    0.09961  -7.019 2.24e-12 ***
## surprise_centered       0.02260    0.03018   0.749   0.4539    
## I(surprise_centered^2)  0.06017    0.02631   2.287   0.0222 *  
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Correlation of Fixed Effects:
##             (Intr) srprs_
## srprs_cntrd  0.018       
## I(srprs_^2) -0.582 -0.040
#Table of significance effects
tab_model(model1)
  errorscore_recognition_bin
Predictors Odds Ratios CI p
(Intercept) 0.50 0.41 – 0.60 <0.001
surprise centered 1.02 0.96 – 1.09 0.454
surprise centered^2 1.06 1.01 – 1.12 0.022
Random Effects
σ2 3.29
τ00 item 0.14
τ00 participant 0.10
ICC 0.07
N participant 43
N item 60
Observations 2580
Marginal R2 / Conditional R2 0.003 / 0.072
#Significance testing
Anova(model1)
## Analysis of Deviance Table (Type II Wald chisquare tests)
## 
## Response: errorscore_recognition_bin
##                         Chisq Df Pr(>Chisq)  
## surprise_centered      0.5608  1    0.45394  
## I(surprise_centered^2) 5.2290  1    0.02221 *
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
#Simple plot
emmip(model1,  ~ surprise_centered,at = list(surprise_centered=c(-3:3)),CIs = TRUE,type ="scale")

##Creating a pretty plot
#Saving effect sizes into a data frame
ESmodel1 <- effect(term = "surprise_centered", mod = model1)
ESmodel1 <- as.data.frame(ESmodel1)

summary(ESmodel1)
##  surprise_centered      fit               se              lower       
##  Min.   :-3.000    Min.   :0.3323   Min.   :0.02062   Min.   :0.2905  
##  1st Qu.:-1.000    1st Qu.:0.3404   1st Qu.:0.02101   1st Qu.:0.3012  
##  Median : 0.060    Median :0.3506   Median :0.02210   Median :0.3106  
##  Mean   : 0.012    Mean   :0.3890   Mean   :0.03421   Mean   :0.3240  
##  3rd Qu.: 1.000    3rd Qu.:0.4439   3rd Qu.:0.05313   3rd Qu.:0.3418  
##  Max.   : 3.000    Max.   :0.4775   Max.   :0.05416   Max.   :0.3758  
##      upper       
##  Min.   :0.3770  
##  1st Qu.:0.3819  
##  Median :0.3928  
##  Mean   :0.4568  
##  3rd Qu.:0.5510  
##  Max.   :0.5812
#Aggregating raw data for plotting results on a group level
data_centered_agg <- aggregate(errorscore_recognition_bin ~ participant + surprise_centered + surprise_centered^2,
                               data = data_centered, mean, na.action = na.omit)

#Plotting estimates
plot_model1 <- ggplot() +
  #Aggregated values of the raw data (black dots)
  geom_point(data = data_centered_agg, aes(x = surprise_centered, y = errorscore_recognition_bin)) + 
  #Values of model estimates (blue dots)
  geom_point(data = ESmodel1, aes(x = surprise_centered, y = fit), color = "blue") +
  #Line of model estimates (blue line)
  geom_line(data = ESmodel1, aes(x = surprise_centered, y = fit), color = "blue") +
  #Ribbon of CI limits for the model estimates (blue)
  geom_ribbon(data = ESmodel1, aes(x = surprise_centered, ymin = lower, ymax = upper), alpha = 0.3, fill = "blue") +
  #Labels to increase understanding
  labs(x = "Surprise (centered)", y = "Probability error score recognition")

plot_model1


We find a significant quadratic effect of surprise (p = 0.022).
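Since the quadratic coefficient is positive, the fitted curve is U-shaped; its turning point on the centered surprise scale follows from the standard vertex formula -b/(2a). A quick sketch, with b and a copied from the summary(model1) fixed effects above (not recomputed from the model object):

```r
#Vertex of the fitted quadratic on the centered surprise scale: -b / (2a)
b <- 0.02260  #linear term: surprise_centered
a <- 0.06017  #quadratic term: I(surprise_centered^2)
vertex <- -b / (2 * a)
round(vertex, 2)
```

The minimum lies just below each participant's mean surprise level, so predicted recall rises toward both ends of the centered scale.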

#Linearity
plot(resid(model1), data_centered$errorscore_recognition)

#Normality of the residuals
plot_model(model1, type = "diag")
## $item
## `geom_smooth()` using formula 'y ~ x'

## 
## $participant
## `geom_smooth()` using formula 'y ~ x'

#Homogeneity of variance
plot(model1)

#First: subsetting the data to only include highly surprising items (surprise = 4 or 5)
data_extreme <- subset(data_centered, surprise > 3)

#Model of plausibility on immediate recall as binary variable
model2 <- glmer(errorscore_recognition_bin ~ 1 + implausibility + (1 | participant) +(1 | item), data = data_extreme, family = binomial)

#Summary of the model
summary(model2)
## Generalized linear mixed model fit by maximum likelihood (Laplace
##   Approximation) [glmerMod]
##  Family: binomial  ( logit )
## Formula: errorscore_recognition_bin ~ 1 + implausibility + (1 | participant) +  
##     (1 | item)
##    Data: data_extreme
## 
##      AIC      BIC   logLik deviance df.resid 
##   1394.6   1414.4   -693.3   1386.6     1040 
## 
## Scaled residuals: 
##     Min      1Q  Median      3Q     Max 
## -1.1036 -0.7978 -0.6514  1.0859  2.0098 
## 
## Random effects:
##  Groups      Name        Variance Std.Dev.
##  item        (Intercept) 0.06841  0.2615  
##  participant (Intercept) 0.18231  0.4270  
## Number of obs: 1044, groups:  item, 60; participant, 43
## 
## Fixed effects:
##                Estimate Std. Error z value Pr(>|z|)    
## (Intercept)    -0.41397    0.12151  -3.407 0.000657 ***
## implausibility -0.07504    0.13966  -0.537 0.591058    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Correlation of Fixed Effects:
##             (Intr)
## implausblty -0.556
#Table of significance effects
tab_model(model2)
  errorscore_recognition_bin
Predictors Odds Ratios CI p
(Intercept) 0.66 0.52 – 0.84 0.001
implausibility 0.93 0.71 – 1.22 0.591
Random Effects
σ2 3.29
τ00 item 0.07
τ00 participant 0.18
ICC 0.07
N participant 43
N item 60
Observations 1044
Marginal R2 / Conditional R2 0.000 / 0.071
#Significance testing
Anova(model2)
## Analysis of Deviance Table (Type II Wald chisquare tests)
## 
## Response: errorscore_recognition_bin
##                 Chisq Df Pr(>Chisq)
## implausibility 0.2887  1     0.5911
#Simple plot
emmip(model2, ~ implausibility, CIs = TRUE, type = "scale")

##Creating a pretty plot
#Saving effect sizes into a data frame
ESmodel2 <- effect(term = "implausibility", mod = model2)
ESmodel2 <- as.data.frame(ESmodel2)

summary(ESmodel2)
##  implausibility      fit               se              lower       
##  Min.   :0.0    Min.   :0.3801   Min.   :0.02401   Min.   :0.3247  
##  1st Qu.:0.2    1st Qu.:0.3837   1st Qu.:0.02591   1st Qu.:0.3340  
##  Median :0.5    Median :0.3890   Median :0.02606   Median :0.3425  
##  Mean   :0.5    Mean   :0.3890   Mean   :0.02686   Mean   :0.3379  
##  3rd Qu.:0.8    3rd Qu.:0.3944   3rd Qu.:0.02911   3rd Qu.:0.3431  
##  Max.   :1.0    Max.   :0.3980   Max.   :0.02923   Max.   :0.3449  
##      upper       
##  Min.   :0.4358  
##  1st Qu.:0.4370  
##  Median :0.4388  
##  Mean   :0.4428  
##  3rd Qu.:0.4461  
##  Max.   :0.4562
#Aggregating raw data for plotting results on a group level
data_centered_agg_implausibility <- aggregate(errorscore_recognition_bin ~ participant + implausibility,
                                              data = data_extreme, mean, na.action = na.omit)

#Plotting estimates
plot_model2 <- ggplot() +
  #Aggregated values of the raw data (black dots)
  geom_point(data = data_centered_agg_implausibility, aes(x = implausibility, y = errorscore_recognition_bin)) + 
  #Values of model estimates (blue dots)
  geom_point(data = ESmodel2, aes(x = implausibility, y = fit), color = "blue") +
  #Line of model estimates (blue line)
  geom_line(data = ESmodel2, aes(x = implausibility, y = fit), color = "blue") +
  #Ribbon of CI limits for the model estimates (blue)
  geom_ribbon(data = ESmodel2, aes(x = implausibility, ymin = lower, ymax = upper), alpha = 0.3, fill = "blue") +
  #Labels to increase understanding
  labs(x = "Implausibility", y = "Probability error score recognition")

plot_model2

#Linearity
plot(resid(model2), data_extreme$errorscore_recognition)

#Normality of the residuals
plot_model(model2, type = "diag")
## $item
## `geom_smooth()` using formula 'y ~ x'

## 
## $participant
## `geom_smooth()` using formula 'y ~ x'

#Homogeneity of variance
plot(model2)


Hypothesis 2

The second analysis concerns delayed recognition: how well can participants remember whether they have seen facts before? To test this, we implemented a second test moment two days after the original study. We presented participants with 90 facts, 45 of which they had seen before, the other 45 being new. Participants then had to indicate (a) how sure they were that they’d seen the fact before on a Likert scale of 1-5 (1 = definitely not seen before, 5 = definitely seen before), and (b) what they believed the correct answer was. For this hypothesis, we are interested in part (a).


Firstly, we split up the dataset to only include facts that were presented on both days, and create the necessary variables for further analyses.

#Splitting up the data to only include the 45 items that were included on day 2 
data_45 <- subset(data, familiarity == "1")

#Centering this data
#Compute mean scores for surprise and baseline error, and combine with the existing data frame
data_45_centered <- data_45 %>%
  group_by(participant) %>%
  summarize(surprise_mean = mean(surprise), errorscore_baseline_mean = mean(errorscore_baseline)) %>%
  left_join(data_45, by = c("participant"))

#Compute centered scores
data_45_centered$surprise_centered <- data_45_centered$surprise - data_45_centered$surprise_mean
data_45_centered$errorscore_baseline_centered <- data_45_centered$errorscore_baseline - data_45_centered$errorscore_baseline_mean

#Code the recognition variable
#We take familiarity_rating = 5 to mean the participant is sure they've seen the fact before, and 1-4 to mean they are not sure
#All of the items in this subset have been seen before, so 5 = correct and 1-4 = incorrect
#Thus: 1-4 = 0 (incorrect), 5 = 1 (correct)
data_45_centered$recognition_bin <- ifelse(data_45_centered$familiarity_rating == 5, 1, 0)

We can then perform the analysis to investigate whether surprise influences delayed recognition.

#Linear + quadratic effect of surprise on delayed recognition as binary variable
model3 <- glmer(recognition_bin ~ 1 + surprise_centered + I(surprise_centered^2) + (1 | participant) + (1 | item), data = data_45_centered, family = binomial)

#Summary of the model
summary(model3)
## Generalized linear mixed model fit by maximum likelihood (Laplace
##   Approximation) [glmerMod]
##  Family: binomial  ( logit )
## Formula: recognition_bin ~ 1 + surprise_centered + I(surprise_centered^2) +  
##     (1 | participant) + (1 | item)
##    Data: data_45_centered
## 
##      AIC      BIC   logLik deviance df.resid 
##   1186.0   1213.8   -588.0   1176.0     1930 
## 
## Scaled residuals: 
##     Min      1Q  Median      3Q     Max 
## -6.0543  0.0731  0.1528  0.3157  2.6124 
## 
## Random effects:
##  Groups      Name        Variance Std.Dev.
##  item        (Intercept) 0.642    0.8012  
##  participant (Intercept) 4.571    2.1379  
## Number of obs: 1935, groups:  item, 45; participant, 43
## 
## Fixed effects:
##                        Estimate Std. Error z value Pr(>|z|)    
## (Intercept)             3.05918    0.39678   7.710 1.26e-14 ***
## surprise_centered       0.08214    0.06074   1.352    0.176    
## I(surprise_centered^2) -0.01355    0.05034  -0.269    0.788    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Correlation of Fixed Effects:
##             (Intr) srprs_
## srprs_cntrd -0.012       
## I(srprs_^2) -0.284  0.093
#Table of significance effects
tab_model(model3)
                               recognition_bin
Predictors                     Odds Ratios    CI             p
(Intercept)                    21.31          9.79 – 46.38   <0.001
surprise centered              1.09           0.96 – 1.22    0.176
surprise centered^2            0.99           0.89 – 1.09    0.788
Random Effects
σ2                             3.29
τ00 item                       0.64
τ00 participant                4.57
ICC                            0.61
N participant                  43
N item                         45
Observations                   1935
Marginal R2 / Conditional R2   0.002 / 0.614
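As a sanity check on the table above: the odds ratios tab_model reports are the exponentiated fixed-effect estimates from summary(model3), and for a logistic model the ICC uses the fixed logistic residual variance π²/3. A minimal recomputation from the printed estimates:

```r
#Fixed-effect estimates and random-effect variances copied from summary(model3)
est_intercept   <- 3.05918
est_surprise    <- 0.08214
tau_item        <- 0.642
tau_participant <- 4.571

#Odds ratios = exp(estimate)
exp(est_intercept) #~21.31, matching the table
exp(est_surprise)  #~1.09

#ICC for a logistic mixed model: random variance / (random variance + pi^2/3)
(tau_item + tau_participant) / (tau_item + tau_participant + pi^2 / 3) #~0.61
```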
#Significance testing
Anova(model3)
## Analysis of Deviance Table (Type II Wald chisquare tests)
## 
## Response: recognition_bin
##                         Chisq Df Pr(>Chisq)
## surprise_centered      1.8288  1     0.1763
## I(surprise_centered^2) 0.0724  1     0.7879
#Simple plot
emmip(model3, ~ surprise_centered, at = list(surprise_centered = -3:3), CIs = TRUE, type = "scale")

##Creating a pretty plot
#Saving effect sizes into a data frame
ESmodel3 <- effect(term = "surprise_centered", mod = model3)
ESmodel3 <- as.data.frame(ESmodel3)

summary(ESmodel3)
##  surprise_centered      fit               se              lower       
##  Min.   :-3.00     Min.   :0.9365   Min.   :0.01570   Min.   :0.8394  
##  1st Qu.:-1.00     1st Qu.:0.9509   1st Qu.:0.01712   1st Qu.:0.8906  
##  Median :-0.10     Median :0.9548   Median :0.01822   Median :0.9001  
##  Mean   :-0.02     Mean   :0.9521   Mean   :0.02074   Mean   :0.8901  
##  3rd Qu.: 1.00     3rd Qu.:0.9580   3rd Qu.:0.02119   3rd Qu.:0.9066  
##  Max.   : 3.00     Max.   :0.9602   Max.   :0.03148   Max.   :0.9139  
##      upper       
##  Min.   :0.9765  
##  1st Qu.:0.9765  
##  Median :0.9787  
##  Mean   :0.9796  
##  3rd Qu.:0.9800  
##  Max.   :0.9862
#Aggregating raw data for plotting results on a group level
data_45_centered_agg <- aggregate(recognition_bin ~ participant + surprise_centered,
                               data = data_45_centered, mean, na.action = na.omit)

#Plotting estimates
plot_model3 <- ggplot() +
  #Aggregated values of the raw data (black dots)
  geom_point(data = data_45_centered_agg, aes(x = surprise_centered, y = recognition_bin)) + 
  #Values of model estimates (blue dots)
  geom_point(data = ESmodel3, aes(x = surprise_centered, y = fit), color = "blue") +
  #Line of model estimates (blue line)
  geom_line(data = ESmodel3, aes(x = surprise_centered, y = fit), color = "blue") +
  #Ribbon of CI limits for the model estimates (blue)
  geom_ribbon(data = ESmodel3, aes(x = surprise_centered, ymin = lower, ymax = upper), alpha = 0.3, fill = "blue") +
  #Labels to increase understanding
  labs(x = "Surprise (centered)", y = "Probability of correct recognition")

plot_model3

#Linearity
plot(resid(model3), data_45_centered$recognition_bin)

#Normality of the residuals
plot_model(model3, type = "diag")
## $item
## `geom_smooth()` using formula 'y ~ x'

## 
## $participant
## `geom_smooth()` using formula 'y ~ x'

#Homogeneity of variance
plot(model3)

#First: subsetting the data to only include highly surprising items (surprise rating of 4 or 5)
data_45_extreme <- subset(data_45_centered, surprise > 3)

#Model of plausibility on delayed recognition as binary variable
model4 <- glmer(recognition_bin ~ 1 + implausibility + (1 | participant) + (1 | item), data = data_45_extreme, family = binomial)

#Summary
summary(model4)
## Generalized linear mixed model fit by maximum likelihood (Laplace
##   Approximation) [glmerMod]
##  Family: binomial  ( logit )
## Formula: recognition_bin ~ 1 + implausibility + (1 | participant) + (1 |  
##     item)
##    Data: data_45_extreme
## 
##      AIC      BIC   logLik deviance df.resid 
##    452.2    471.0   -222.1    444.2      820 
## 
## Scaled residuals: 
##     Min      1Q  Median      3Q     Max 
## -5.6351  0.0889  0.1166  0.2659  1.8444 
## 
## Random effects:
##  Groups      Name        Variance Std.Dev.
##  item        (Intercept) 0.4423   0.665   
##  participant (Intercept) 5.0627   2.250   
## Number of obs: 824, groups:  item, 45; participant, 43
## 
## Fixed effects:
##                Estimate Std. Error z value Pr(>|z|)    
## (Intercept)      3.3498     0.5272   6.353 2.11e-10 ***
## implausibility   0.1165     0.3036   0.384    0.701    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Correlation of Fixed Effects:
##             (Intr)
## implausblty -0.307
tab_model(model4)
                               recognition_bin
Predictors                     Odds Ratios    CI              p
(Intercept)                    28.50          10.14 – 80.10   <0.001
implausibility                 1.12           0.62 – 2.04     0.701
Random Effects
σ2                             3.29
τ00 item                       0.44
τ00 participant                5.06
ICC                            0.63
N participant                  43
N item                         45
Observations                   824
Marginal R2 / Conditional R2   0.000 / 0.626
#Significance testing
Anova(model4)
## Analysis of Deviance Table (Type II Wald chisquare tests)
## 
## Response: recognition_bin
##                 Chisq Df Pr(>Chisq)
## implausibility 0.1473  1     0.7011
#Plot
emmip(model4, ~ implausibility, CIs = TRUE, type = "scale")

##Creating a pretty plot
#Saving effect sizes into a data frame
ESmodel4 <- effect(term = "implausibility", mod = model4)
ESmodel4 <- as.data.frame(ESmodel4)

summary(ESmodel4)
##  implausibility      fit               se              lower       
##  Min.   :0.0    Min.   :0.9661   Min.   :0.01525   Min.   :0.9102  
##  1st Qu.:0.2    1st Qu.:0.9669   1st Qu.:0.01531   1st Qu.:0.9145  
##  Median :0.5    Median :0.9680   Median :0.01557   Median :0.9187  
##  Mean   :0.5    Mean   :0.9679   Mean   :0.01596   Mean   :0.9168  
##  3rd Qu.:0.8    3rd Qu.:0.9690   3rd Qu.:0.01640   3rd Qu.:0.9202  
##  Max.   :1.0    Max.   :0.9697   Max.   :0.01727   Max.   :0.9203  
##      upper       
##  Min.   :0.9876  
##  1st Qu.:0.9877  
##  Median :0.9878  
##  Mean   :0.9880  
##  3rd Qu.:0.9883  
##  Max.   :0.9889
#Aggregating raw data for plotting results on a group level
data_centered_45_implausibility <- aggregate(recognition_bin ~ participant + implausibility,
                                              data = data_45_extreme, mean, na.action = na.omit)

#Plotting estimates
plot_model4 <- ggplot() +
  #Aggregated values of the raw data (black dots)
  geom_point(data = data_centered_45_implausibility, aes(x = implausibility, y = recognition_bin)) + 
  #Values of model estimates (blue dots)
  geom_point(data = ESmodel4, aes(x = implausibility, y = fit), color = "blue") +
  #Line of model estimates (blue line)
  geom_line(data = ESmodel4, aes(x = implausibility, y = fit), color = "blue") +
  #Ribbon of CI limits for the model estimates (blue)
  geom_ribbon(data = ESmodel4, aes(x = implausibility, ymin = lower, ymax = upper), alpha = 0.3, fill = "blue") +
  #Labels to increase understanding
  labs(x = "Implausibility", y = "Probability of correct recognition")

plot_model4

#Linearity
plot(resid(model4), data_45_extreme$recognition_bin)

#Normality of the residuals
plot_model(model4, type = "diag")
## $item
## `geom_smooth()` using formula 'y ~ x'

## 
## $participant
## `geom_smooth()` using formula 'y ~ x'

#Homogeneity of variance
plot(model4)

Hypothesis 3

For the final research question, we were interested in part (b) of the aforementioned task, where participants had to indicate what they believed to be the correct answer. To investigate this, we first create two error scores: a baseline error score, the absolute difference between the original estimate in block one and the presented answer, and an updated error score, the absolute difference between the estimate in block three and the presented answer. Subtracting the baseline error score from the updated error score then gives a measure of updating: negative values indicate that participants shifted their memory toward the presented answer.
We expect that low-surprise items correspond to estimates that were already close to the presented answer (small baseline error score), whereas high-surprise items correspond to estimates that were further away (large baseline error score). We therefore expect a linear effect of surprise on updating, as highly surprising items leave more room for improvement. For implausible items, however, we expect a different pattern: if participants treat an implausible item as an anomaly, they will reject it from their belief system and show little updating. We thus expect an inverted U-shape for the effect of surprise in implausible items, and a difference in updating between high- and low-plausibility items.

We first create the necessary variables.

#Create error score variables
data_45_centered$errorscore_estimate <- abs(data_45_centered$estimate_day2 - data_45_centered$given_answer)
data_45_centered$errorscore_estimate_bin <- ifelse(data_45_centered$errorscore_estimate == 0, 0, 1) #0 = day-2 estimate matches the given answer exactly, 1 = some error

#Create updating variable
data_45_centered$updating <- data_45_centered$errorscore_estimate - data_45_centered$errorscore_baseline
data_45_centered$updating_bin <- ifelse(data_45_centered$updating == 0, 0, 1) #0 = no updating, 1 = some amount of updating

#Create accuracy variables
data_45_centered$accuracy_temp <- abs(data_45_centered$estimate_day2 - data_45_centered$given_answer)
data_45_centered$accuracy <- ifelse(data_45_centered$accuracy_temp == 0, 1, 0) #1 = accurate (day-2 estimate equals the given answer), 0 = not accurate
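To make the direction of the updating variable concrete, here is a toy calculation with hypothetical numbers: a participant initially estimated 2, was shown given_answer = 8, and estimated 7 on day 2.

```r
given_answer      <- 8
baseline_estimate <- 2 #block-one estimate (hypothetical)
day2_estimate     <- 7 #day-2 estimate (hypothetical)

errorscore_baseline <- abs(baseline_estimate - given_answer) #6
errorscore_estimate <- abs(day2_estimate - given_answer)     #1

#Negative updating = the day-2 estimate moved toward the presented answer
updating <- errorscore_estimate - errorscore_baseline #-5
```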

We can then perform the analyses to investigate whether surprise influences to what extent participants update their belief systems.

model5 <- lmer(updating ~ 1 + surprise_centered + I(surprise_centered^2) + (1 | participant) + (1 | item), data = data_45_centered)

#Summary
summary(model5)
## Linear mixed model fit by REML ['lmerMod']
## Formula: updating ~ 1 + surprise_centered + I(surprise_centered^2) + (1 |  
##     participant) + (1 | item)
##    Data: data_45_centered
## 
## REML criterion at convergence: 7782.6
## 
## Scaled residuals: 
##     Min      1Q  Median      3Q     Max 
## -3.7404 -0.6090 -0.0399  0.5782  4.1221 
## 
## Random effects:
##  Groups      Name        Variance Std.Dev.
##  item        (Intercept) 0.08644  0.2940  
##  participant (Intercept) 0.11749  0.3428  
##  Residual                3.12185  1.7669  
## Number of obs: 1935, groups:  item, 45; participant, 43
## 
## Fixed effects:
##                         Estimate Std. Error t value
## (Intercept)            -0.962525   0.095574 -10.071
## surprise_centered      -0.734839   0.029203 -25.163
## I(surprise_centered^2)  0.007669   0.025237   0.304
## 
## Correlation of Fixed Effects:
##             (Intr) srprs_
## srprs_cntrd -0.030       
## I(srprs_^2) -0.560  0.053
tab_model(model5)
                               updating
Predictors                     Estimates      CI              p
(Intercept)                    -0.96          -1.15 – -0.78   <0.001
surprise centered              -0.73          -0.79 – -0.68   <0.001
surprise centered^2            0.01           -0.04 – 0.06    0.761
Random Effects
σ2                             3.12
τ00 item                       0.09
τ00 participant                0.12
ICC                            0.06
N participant                  43
N item                         45
Observations                   1935
Marginal R2 / Conditional R2   0.257 / 0.302
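The ICC tab_model reports for this linear model can be recomputed from the variance components printed in summary(model5), using the estimated residual variance:

```r
#Random-effect and residual variances copied from summary(model5)
tau_item        <- 0.08644
tau_participant <- 0.11749
sigma2_resid    <- 3.12185

#ICC = share of total variance attributable to the random effects
(tau_item + tau_participant) / (tau_item + tau_participant + sigma2_resid) #~0.06
```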
#Significance testing
Anova(model5)
## Analysis of Deviance Table (Type II Wald chisquare tests)
## 
## Response: updating
##                           Chisq Df Pr(>Chisq)    
## surprise_centered      633.1850  1     <2e-16 ***
## I(surprise_centered^2)   0.0923  1     0.7612    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
#Plot
emmip(model5, ~ surprise_centered, at = list(surprise_centered = -3:3), CIs = TRUE)

##Creating a pretty plot
#Saving effect sizes into a data frame
ESmodel5 <- effect(term = "surprise_centered", mod = model5)
ESmodel5 <- as.data.frame(ESmodel5)

summary(ESmodel5)
##  surprise_centered      fit                se              lower        
##  Min.   :-3.00     Min.   :-3.0980   Min.   :0.08851   Min.   :-3.5172  
##  1st Qu.:-1.00     1st Qu.:-1.6897   1st Qu.:0.08949   1st Qu.:-1.8633  
##  Median :-0.10     Median :-0.8890   Median :0.09556   Median :-1.0764  
##  Mean   :-0.02     Mean   :-0.9171   Mean   :0.13867   Mean   :-1.1891  
##  3rd Qu.: 1.00     3rd Qu.:-0.2200   3rd Qu.:0.20608   3rd Qu.:-0.3955  
##  Max.   : 3.00     Max.   : 1.3110   Max.   :0.21373   Max.   : 0.9069  
##      upper         
##  Min.   :-2.67886  
##  1st Qu.:-1.51611  
##  Median :-0.70155  
##  Mean   :-0.64517  
##  3rd Qu.:-0.04451  
##  Max.   : 1.71516
#Aggregating raw data for plotting results on a group level
data_45_updating_agg <- aggregate(updating ~ participant + surprise_centered,
                               data = data_45_centered, mean, na.action = na.omit)

#Plotting estimates
plot_model5 <- ggplot() +
  #Aggregated values of the raw data (black dots)
  geom_point(data = data_45_updating_agg, aes(x = surprise_centered, y = updating)) + 
  #Values of model estimates (blue dots)
  geom_point(data = ESmodel5, aes(x = surprise_centered, y = fit), color = "blue") +
  #Line of model estimates (blue line)
  geom_line(data = ESmodel5, aes(x = surprise_centered, y = fit), color = "blue") +
  #Ribbon of CI limits for the model estimates (blue)
  geom_ribbon(data = ESmodel5, aes(x = surprise_centered, ymin = lower, ymax = upper), alpha = 0.3, fill = "blue") +
  #Labels to increase understanding
  labs(x = "Surprise (centered)", y = "Updating")

plot_model5

#Linearity
plot(resid(model5), data_45_centered$updating)

#Normality of the residuals
plot_model(model5, type = "diag")
## [[1]]
## `geom_smooth()` using formula 'y ~ x'

## 
## [[2]]
## [[2]]$item
## `geom_smooth()` using formula 'y ~ x'

## 
## [[2]]$participant
## `geom_smooth()` using formula 'y ~ x'

## 
## 
## [[3]]

## 
## [[4]]
## `geom_smooth()` using formula 'y ~ x'

#Homogeneity of variance
plot(model5)


Finally, we can investigate the role of plausibility in highly surprising items on updating.

#Creating extreme values dataset
data_45_extreme <- subset(data_45_centered, surprise > 3)

#Model of plausibility on updating
model6 <- lmer(updating ~ 1 + implausibility + (1 | participant) + (1 | item), data = data_45_extreme)

#Summary
summary(model6)
## Linear mixed model fit by REML ['lmerMod']
## Formula: updating ~ 1 + implausibility + (1 | participant) + (1 | item)
##    Data: data_45_extreme
## 
## REML criterion at convergence: 3564.9
## 
## Scaled residuals: 
##     Min      1Q  Median      3Q     Max 
## -3.4338 -0.6478 -0.0123  0.6642  3.2079 
## 
## Random effects:
##  Groups      Name        Variance Std.Dev.
##  item        (Intercept) 0.2114   0.4598  
##  participant (Intercept) 0.2760   0.5254  
##  Residual                4.0917   2.0228  
## Number of obs: 824, groups:  item, 45; participant, 43
## 
## Fixed effects:
##                Estimate Std. Error t value
## (Intercept)     -1.8666     0.1507 -12.383
## implausibility  -0.2738     0.1525  -1.795
## 
## Correlation of Fixed Effects:
##             (Intr)
## implausblty -0.505
tab_model(model6)
                               updating
Predictors                     Estimates      CI              p
(Intercept)                    -1.87          -2.16 – -1.57   <0.001
implausibility                 -0.27          -0.57 – 0.03    0.073
Random Effects
σ2                             4.09
τ00 item                       0.21
τ00 participant                0.28
ICC                            0.11
N participant                  43
N item                         45
Observations                   824
Marginal R2 / Conditional R2   0.004 / 0.110
#Significance testing
Anova(model6)
## Analysis of Deviance Table (Type II Wald chisquare tests)
## 
## Response: updating
##                 Chisq Df Pr(>Chisq)  
## implausibility 3.2228  1    0.07262 .
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
#Plot
emmip(model6, ~ implausibility, CIs = TRUE)

##Creating a pretty plot
#Saving effect sizes into a data frame
ESmodel6 <- effect(term = "implausibility", mod = model6)
ESmodel6 <- as.data.frame(ESmodel6)

summary(ESmodel6)
##  implausibility      fit               se             lower       
##  Min.   :0.0    Min.   :-2.140   Min.   :0.1301   Min.   :-2.437  
##  1st Qu.:0.2    1st Qu.:-2.086   1st Qu.:0.1379   1st Qu.:-2.356  
##  Median :0.5    Median :-2.004   Median :0.1379   Median :-2.259  
##  Mean   :0.5    Mean   :-2.004   Mean   :0.1415   Mean   :-2.281  
##  3rd Qu.:0.8    3rd Qu.:-1.921   3rd Qu.:0.1507   3rd Qu.:-2.192  
##  Max.   :1.0    Max.   :-1.867   Max.   :0.1508   Max.   :-2.162  
##      upper       
##  Min.   :-1.844  
##  1st Qu.:-1.815  
##  Median :-1.748  
##  Mean   :-1.726  
##  3rd Qu.:-1.651  
##  Max.   :-1.571
#Aggregating raw data for plotting results on a group level
data_45_extreme_agg <- aggregate(updating ~ participant + implausibility, 
                                 data = data_45_extreme, mean, na.action = na.omit)

#Plotting estimates
plot_model6 <- ggplot() +
  #Aggregated values of the raw data (black dots)
  geom_point(data = data_45_extreme_agg, aes(x = implausibility, y = updating)) + 
  #Values of model estimates (blue dots)
  geom_point(data = ESmodel6, aes(x = implausibility, y = fit), color = "blue") +
  #Line of model estimates (blue line)
  geom_line(data = ESmodel6, aes(x = implausibility, y = fit), color = "blue") +
  #Ribbon of CI limits for the model estimates (blue)
  geom_ribbon(data = ESmodel6, aes(x = implausibility, ymin = lower, ymax = upper), alpha = 0.3, fill = "blue") +
  #Labels to increase understanding
  labs(x = "Implausibility", y = "Updating")

plot_model6


Thus, we find a non-significant trend toward an effect of implausibility on updating (p = .073).

#Linearity
plot(resid(model6), data_45_extreme$updating)

#Normality of the residuals
plot_model(model6, type = "diag")
## [[1]]
## `geom_smooth()` using formula 'y ~ x'

## 
## [[2]]
## [[2]]$item
## `geom_smooth()` using formula 'y ~ x'

## 
## [[2]]$participant
## `geom_smooth()` using formula 'y ~ x'

## 
## 
## [[3]]

## 
## [[4]]
## `geom_smooth()` using formula 'y ~ x'

#Homogeneity of variance
plot(model6)

Exploratory analyses

As an exploratory analysis, we look at the effect of plausibility on binary accuracy in the updating task. Accuracy here indicates whether a participant filled in exactly the provided answer on day 2.

#Model of plausibility on accuracy
model7 <- glmer(accuracy ~ 1 + implausibility + (1 | participant) + (1 | item), data = data_45_extreme, family = binomial)

#Summary
summary(model7)
## Generalized linear mixed model fit by maximum likelihood (Laplace
##   Approximation) [glmerMod]
##  Family: binomial  ( logit )
## Formula: accuracy ~ 1 + implausibility + (1 | participant) + (1 | item)
##    Data: data_45_extreme
## 
##      AIC      BIC   logLik deviance df.resid 
##    812.3    831.1   -402.1    804.3      820 
## 
## Scaled residuals: 
##     Min      1Q  Median      3Q     Max 
## -0.8688 -0.5034 -0.4282 -0.3474  2.7425 
## 
## Random effects:
##  Groups      Name        Variance Std.Dev.
##  item        (Intercept) 0.32264  0.568   
##  participant (Intercept) 0.05857  0.242   
## Number of obs: 824, groups:  item, 45; participant, 43
## 
## Fixed effects:
##                Estimate Std. Error z value Pr(>|z|)    
## (Intercept)     -1.3430     0.1620  -8.291   <2e-16 ***
## implausibility  -0.3209     0.1879  -1.708   0.0876 .  
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Correlation of Fixed Effects:
##             (Intr)
## implausblty -0.519
tab_model(model7)
                               accuracy
Predictors                     Odds Ratios    CI            p
(Intercept)                    0.26           0.19 – 0.36   <0.001
implausibility                 0.73           0.50 – 1.05   0.088
Random Effects
σ2                             3.29
τ00 item                       0.32
τ00 participant                0.06
ICC                            0.10
N participant                  43
N item                         45
Observations                   824
Marginal R2 / Conditional R2   0.007 / 0.110
#Significance testing
Anova(model7)
## Analysis of Deviance Table (Type II Wald chisquare tests)
## 
## Response: accuracy
##                 Chisq Df Pr(>Chisq)  
## implausibility 2.9178  1    0.08761 .
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
#Plot
emmip(model7, ~ implausibility, CIs = TRUE)


Once again, we find a non-significant trend (p = .088).

We then investigate the direction of updating for highly surprising items.

#Creating updating variable with direction
data_45_extreme$original_errorscore <- abs(data_45_extreme$original_estimate - data_45_extreme$given_answer)
data_45_extreme$updated_errorscore <- abs(data_45_extreme$estimate_day2 - data_45_extreme$given_answer)

data_45_extreme$directional_updating <- data_45_extreme$updated_errorscore - data_45_extreme$original_errorscore
#If the number is negative, the participant came closer to the given answer

plot_updating <- ggplot(data = subset(data_45_extreme, implausibility == 1), aes(x = directional_updating)) +
  geom_bar(fill = 'white', color = 'black') +
  xlim(-10, 10)

plot_updating

plot_updating_plausible <- ggplot(data = subset(data_45_extreme, implausibility == 0), aes(x = directional_updating)) +
  geom_bar(fill = 'white', color = 'black') +
  xlim(-10, 10)

plot_updating_plausible


Thus, while the effect is not statistically significant, there appears to be a trend of participants updating their beliefs to shift closer to the presented answer.

Session information

To facilitate replication, we include our session information.

#Information about the session, for easier replication
sessionInfo()
## R version 4.2.1 (2022-06-23)
## Platform: x86_64-apple-darwin17.0 (64-bit)
## Running under: macOS Big Sur ... 10.16
## 
## Matrix products: default
## BLAS:   /Library/Frameworks/R.framework/Versions/4.2/Resources/lib/libRblas.0.dylib
## LAPACK: /Library/Frameworks/R.framework/Versions/4.2/Resources/lib/libRlapack.dylib
## 
## locale:
## [1] C
## 
## attached base packages:
## [1] stats     graphics  grDevices utils     datasets  methods   base     
## 
## other attached packages:
##  [1] lattice_0.20-45 simr_1.0.6      effects_4.2-2   sjPlot_2.8.11  
##  [5] emmeans_1.7.5   forcats_0.5.1   stringr_1.4.0   purrr_0.3.4    
##  [9] readr_2.1.2     tidyr_1.2.0     tibble_3.1.7    tidyverse_1.3.1
## [13] dplyr_1.0.9     psych_2.2.5     car_3.1-0       carData_3.0-5  
## [17] ggpubr_0.4.0    ggplot2_3.3.6   lme4_1.1-30     Matrix_1.4-1   
## 
## loaded via a namespace (and not attached):
##  [1] minqa_1.2.4         colorspace_2.0-3    ggsignif_0.6.3     
##  [4] ellipsis_0.3.2      sjlabelled_1.2.0    estimability_1.4   
##  [7] parameters_0.18.2   fs_1.5.2            rstudioapi_0.13    
## [10] glmmTMB_1.1.4       farver_2.1.1        fansi_1.0.3        
## [13] mvtnorm_1.1-3       lubridate_1.8.0     xml2_1.3.3         
## [16] splines_4.2.1       mnormt_2.1.0        knitr_1.39         
## [19] sjmisc_2.8.9        jsonlite_1.8.0      nloptr_2.0.3       
## [22] ggeffects_1.1.3     pbkrtest_0.5.1      broom_1.0.0        
## [25] binom_1.1-1.1       dbplyr_2.2.1        effectsize_0.7.0.5 
## [28] compiler_4.2.1      httr_1.4.3          sjstats_0.18.1     
## [31] backports_1.4.1     assertthat_0.2.1    fastmap_1.1.0      
## [34] survey_4.1-1        cli_3.3.0           htmltools_0.5.2    
## [37] tools_4.2.1         coda_0.19-4         gtable_0.3.0       
## [40] glue_1.6.2          Rcpp_1.0.9          cellranger_1.1.0   
## [43] jquerylib_0.1.4     vctrs_0.4.1         nlme_3.1-157       
## [46] iterators_1.0.14    insight_0.18.2      xfun_0.31          
## [49] rvest_1.0.2         lifecycle_1.0.1     rstatix_0.7.0      
## [52] MASS_7.3-57         scales_1.2.0        hms_1.1.1          
## [55] parallel_4.2.1      TMB_1.9.1           RColorBrewer_1.1-3 
## [58] yaml_2.3.5          sass_0.4.1          stringi_1.7.8      
## [61] highr_0.9           bayestestR_0.12.1   plotrix_3.8-2      
## [64] boot_1.3-28         rlang_1.0.4         pkgconfig_2.0.3    
## [67] evaluate_0.15       labeling_0.4.2      cowplot_1.1.1      
## [70] tidyselect_1.1.2    plyr_1.8.7          magrittr_2.0.3     
## [73] R6_2.5.1            generics_0.1.3      RLRsim_3.1-8       
## [76] DBI_1.1.3           pillar_1.7.0        haven_2.5.0        
## [79] withr_2.5.0         mgcv_1.8-40         survival_3.3-1     
## [82] datawizard_0.5.1    abind_1.4-5         nnet_7.3-17        
## [85] performance_0.9.2   modelr_0.1.8        crayon_1.5.1       
## [88] utf8_1.2.2          tzdb_0.3.0          rmarkdown_2.14     
## [91] grid_4.2.1          readxl_1.4.0        reprex_2.0.1       
## [94] digest_0.6.29       xtable_1.8-4        numDeriv_2016.8-1.1
## [97] munsell_0.5.0       bslib_0.3.1         mitools_2.4